Locally Optimal Control of Quantum Systems with Strong Feedback 
For quantum systems with high purity, we find all observables that, when continuously monitored, 
maximize the instantaneous reduction in the average linear entropy. This allows us to obtain all 
locally optimal feedback protocols with strong feedback, and explicit expressions for the best such 
protocols for systems of size $N \le 4$. We also show that for a qutrit the locally optimal protocol is 
the optimal protocol for observables with equispaced eigenvalues, providing the first fully optimal 
feedback protocol for a 3-state system. 
Observation and control of coherent quantum behavior has been realized in a variety of mesoscopic devices [1, 2, 3, 4]. With further refinements, such devices may well form the basis of new technologies, for example in sensing [5] and communication [6]. Feedback, in which a system is continuously observed and the information used to control its behavior in the presence of noise, is an important element in the quantum engineer's toolbox [7, 8, 9, 10, 11]. In view of this, one would like to know the limits on such control, given any relevant limitations on the measurement and/or control forces. However, except in special cases [12], the dynamics of continuously observed quantum systems is nonlinear. Further, results on the quantum-to-classical transition show that this nonlinear dynamics, described by stochastic master equations (SMEs), is necessarily every bit as complex (and chaotic) as that of nonlinear classical systems [13]. Because of this, fully general and exact results regarding optimal quantum feedback are unlikely to exist; certainly no such results have been found for nonlinear classical systems [14]. Nevertheless, one would like to obtain results that give insights applicable across a range of systems.

Quantum feedback control is implemented by modifying a "control" Hamiltonian, $H$, that is some part of the system Hamiltonian. Here we will examine feedback protocols in the regime where the controls are able to keep the system close to a pure state. This is an important regime, both because it is where many quantum control systems will need to operate, and because it allows one to simplify the problem by using a power series expansion [15]. In addition to working in the regime of good control, we make two further simplifications. The first is that the control is strong; that is, 1) the only constraint on $H$ is that $\mathrm{Tr}[H^2] \le \mu^2$ for some constant $\mu$, and 2) $H$ can induce dynamics much faster than both the dynamics of the system and the rate at which the measurement extracts information. This means that $H$ is effectively unconstrained. We thus deal strictly with a subset of the regime of good control, defined by $\mu \gg k$ and $k \gg \beta$. Here $k$ is the strength of the measurement (defined precisely below), and $\beta$ is the noise strength, which we define as the rate of increase of the linear entropy due to the noise. The latter inequality is essential for good control. This regime is applicable, for example, to mesoscopic superconducting systems, such as coupled Cooper-pair boxes. Here the speed of control rotations is typically 1-10 GHz, and that of decoherence is $10^6\,\mathrm{s}^{-1}$ [16]. Measuring these at a rate $k = 5 \times 10^7\,\mathrm{s}^{-1}$ is reasonable [17], and falls in the above regime.

Our second simplification is to seek control protocols that give the maximum increase in the control objective in each time-step separately, that is, protocols that are locally optimal in time. However, we will find that for a qutrit, the locally optimal protocol (LOP) is the optimal protocol for observables with equispaced eigenvalues.

We will allow the controller to measure a single observable, $X$. Since the control allows us to perform all unitary operations, and since transforming the system is equivalent to transforming the observable being measured, this allows the controller to measure all observables of the form $X^u = U X U^\dagger$, for any unitary $U$. Since the control Hamiltonian is not limited, the only constraint on the controller is the rate at which the measurement extracts information (the measurement strength, $k$).

A sensible and widely applicable control objective is to maximize the probability, $P$, that the system will be found in a desired pure state (referred to as the target state) at a given time $T$ (called the horizon time). This objective also allows one to maximize $P$ in the steady state, and is the objective we will consider here.

In what follows we will denote the state of the quantum system by the density matrix $\rho$, and the $N$ eigenvalues of $\rho$ as $\lambda_i$, $i = 0, \ldots, N-1$. We place these in decreasing order so that $\lambda_i \ge \lambda_{i+1}$. Since the control dynamics is fast, at any time $T$ we can apply $H$ to quickly rotate $\rho$ so as to maximize $P$ at that time. This means rotating $\rho$ so that the eigenstate corresponding to $\lambda_0$ is the target state, giving $P = \lambda_0$. Thus the optimality of the control is determined entirely by the eigenvalues $\lambda_i$. (Since the control Hamiltonian cannot change these eigenvalues, the only further role of $H$ is to set the observable to be measured at each time, $X^u(t)$.) The probability that the state is found in the target state at a given future time is thus the average of $\lambda_0$ over all future trajectories at that time: $P(t) = \langle\lambda_0(t)\rangle$. Note also that because the state is near-pure, we can write $\lambda_0 = 1 - \Delta$, with $\Delta \ll 1$. Thus $P = 1 - \langle\Delta(T)\rangle$, with $\langle\Delta(T)\rangle$ the error probability.

By definition, the locally optimal feedback protocol is the one that maximizes the rate of reduction of $\langle\Delta\rangle$ at each time-step. To find the LOP, we must find the observable, $X^u$, that maximizes the rate of reduction of $\langle\Delta\rangle$ for any $\rho$. To derive the equation of motion for $\langle\Delta\rangle$, we start with the SME for the density matrix under a continuous measurement of $X^u$:

$$d\rho = -k[X^u,[X^u,\rho]]\,dt + \sqrt{2k}\,(X^u\rho + \rho X^u - 2\langle X^u\rangle\rho)\,dW. \qquad (1)$$

Here the assumption of strong feedback allows us to drop any system Hamiltonian, and $dW$ is Gaussian (Wiener) noise, satisfying $\langle dW\rangle = 0$ and $dW^2 = dt$. Note that this SME does not include the noise of strength $\beta$; we exclude this noise in what follows except when we calculate results for the steady state. To obtain the equation of motion for $\langle\Delta\rangle$, to first order in $\Delta$, we first note that $\Delta = (1 - \mathrm{Tr}[\rho^2])/2$ (that is, half the linear entropy), to first order in $\Delta$ [25]. We calculate the derivative of $\mathrm{Tr}[\rho^2]$ directly from the SME, and then expand the result in powers of $\Delta$. This gives

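$$d\Delta = -8k\sum_{i\neq 0}\lambda_i|X^u_{i0}|^2\,dt - \sqrt{8k}\,\Big(\Delta X^u_{00} - \sum_{i\neq 0}\lambda_i X^u_{ii}\Big)\,dW, \qquad (2)$$

where $X^u_{nm} = \langle n|X^u|m\rangle$. The equation of motion for $\langle\Delta\rangle$ is given by averaging this equation over $dW$. Thus $\langle\dot\Delta\rangle = -8k\sum_{i\neq 0}\lambda_i|X^u_{i0}|^2$.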
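The first-order drift in Eq. (2) can be checked numerically. The sketch below is only an illustration under stated assumptions: the three-level state, the real symmetric matrix standing in for $X^u$, the value $k = 1$, and the function names are all hypothetical choices, not taken from the paper. It evaluates the $dt$ coefficient of $d[(1 - \mathrm{Tr}\rho^2)/2]$ directly from the SME using the Ito rule and compares it with $-8k\sum_{i\neq 0}\lambda_i|X^u_{i0}|^2$.

```python
import numpy as np

def half_linear_entropy_drift(rho, X, k):
    """dt-coefficient of d[(1 - Tr rho^2)/2] implied by the SME, Eq. (1), via the Ito rule."""
    mean_X = np.trace(rho @ X).real
    # Deterministic part of d(rho): -k [X, [X, rho]]
    drift = -k * (X @ X @ rho - 2 * X @ rho @ X + rho @ X @ X)
    # Stochastic part of d(rho) is sqrt(2k) * A * dW
    A = X @ rho + rho @ X - 2 * mean_X * rho
    # d Tr[rho^2] = 2 Tr[rho d(rho)] + Tr[(d rho)^2], with (d rho)^2 = 2k A^2 dt
    d_purity = 2 * np.trace(rho @ drift).real + 2 * k * np.trace(A @ A).real
    return -0.5 * d_purity

def first_order_drift(small_lams, X, k):
    """Drift term of Eq. (2): -8k * sum_{i != 0} lambda_i |X_{i0}|^2."""
    return -8 * k * sum(l * abs(X[i + 1, 0]) ** 2 for i, l in enumerate(small_lams))

k = 1.0                                   # hypothetical measurement strength
small = np.array([2e-4, 1e-4])            # hypothetical small eigenvalues lambda_1, lambda_2
rho = np.diag(np.concatenate(([1 - small.sum()], small)))
X = np.array([[0.0, 1.0, 0.5],            # hypothetical Hermitian observable X^u
              [1.0, 0.3, 0.2],
              [0.5, 0.2, -1.0]])

exact = half_linear_entropy_drift(rho, X, k)
approx = first_order_drift(small, X, k)
print(exact, approx)
```

The two numbers should agree up to corrections that are second order in the small eigenvalues, which is the sense in which Eq. (2) holds.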
We now prove the following theorem: 



Theorem 1. Define $X$ as Hermitian, $U$ as unitary, and $\rho$ as a density operator of dimension $N$. Without loss of generality we set $X$ and $\rho$ to be diagonal, arrange the eigenvalues of $\rho$, $\lambda_i$, in decreasing order, and arrange the eigenvalues of $X$ so that the two extreme values are in the first 2-by-2 block, corresponding to the two largest eigenvalues of $\rho$. The maximum of $F(U) = \sum_{i\neq 0}\lambda_i|\langle i|UXU^\dagger|0\rangle|^2 = \sum_{i\neq 0}\lambda_i|X^u_{i0}|^2$ is achieved if

$$U = U_{\rm opt} = U_2 \oplus V, \qquad (3)$$

where $U_2$ is any 2-by-2 unitary unbiased w.r.t. the basis $\{(1,0),(0,1)\}$, that is, one whose elements all have modulus $1/\sqrt{2}$:

$$U_2 = \frac{1}{\sqrt{2}}\begin{pmatrix} e^{i\phi_1} & e^{i\phi_2} \\ e^{i\phi_3} & -e^{i(\phi_2+\phi_3-\phi_1)} \end{pmatrix}, \qquad (4)$$

and $V$ is any unitary with dimension $N-2$.

Proof. We first derive an upper bound on $F(U)$. This is

$$F(U) \le \lambda_1\sum_{i=1}^{N-1}|X^u_{i0}|^2 = \lambda_1\,\mathrm{Var}(X, U^\dagger|0\rangle) \le \lambda_1\frac{(x_{\max}-x_{\min})^2}{4}. \qquad (5)$$

Here $\mathrm{Var}(X,|\psi\rangle)$ denotes the variance of $X$ in the state $|\psi\rangle$, and $x_{\max}$ and $x_{\min}$ are the maximum and minimum eigenvalues of $X$. The first inequality is immediate, and the last is well-known. Since $U_{\rm opt}$ saturates this bound, it achieves the maximum. That only unitaries of the above form achieve the maximum is simplest to show when the eigenvalues of $\rho$ are non-degenerate: to saturate the first inequality one must restrict $U$ to the subspace spanned by $\{|0\rangle,|1\rangle\}$, and to achieve the last, $U$ must be unbiased w.r.t. the eigenbasis. When the eigenvalues of $\rho$ are degenerate, a careful analysis shows that this remains true [23]. □
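As an illustration of Theorem 1, the following sketch compares $F(U)$ for a unitary of the form $U_2 \oplus V$ with the bound $\lambda_1(x_{\max}-x_{\min})^2/4$ and with a random search over unitaries. The dimension, eigenvalues, choice of $U_2$ as a Hadamard-type matrix, sample count, and helper names are hypothetical and chosen only for the example; a random search is of course not a proof.

```python
import numpy as np

def F(U, X, lams):
    """F(U) = sum_{i != 0} lambda_i |<i| U X U^dagger |0>|^2; lams[0] is the largest eigenvalue."""
    Xu = U @ X @ U.conj().T
    return sum(lams[i] * abs(Xu[i, 0]) ** 2 for i in range(1, len(lams)))

def random_unitary(n, rng):
    """Random unitary from the QR decomposition of a complex Gaussian matrix."""
    Z = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    Q, R = np.linalg.qr(Z)
    d = np.diag(R)
    return Q * (d / np.abs(d))   # fix column phases; Q stays unitary

rng = np.random.default_rng(0)
lams = np.array([0.997, 0.0015, 0.001, 0.0005])   # hypothetical near-pure spectrum, decreasing
x_max, x_min = 2.0, -1.0
X = np.diag([x_max, x_min, 0.5, 0.0])             # extreme eigenvalues placed in the first block
bound = lams[1] * (x_max - x_min) ** 2 / 4

# U_opt = U_2 (+) V with U_2 unbiased (Hadamard-type) and V an arbitrary 2x2 unitary
U_opt = np.zeros((4, 4), dtype=complex)
U_opt[:2, :2] = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
U_opt[2:, 2:] = random_unitary(2, rng)

F_opt = F(U_opt, X, lams)
F_best_random = max(F(random_unitary(4, rng), X, lams) for _ in range(2000))
print(F_opt, bound, F_best_random)
```

Here `F_opt` should coincide with `bound` for any choice of the block `V`, while the sampled unitaries should not exceed it.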

The remarkably simple form of $U_{\rm opt}$ tells us that to obtain the fastest reduction in $\langle\Delta\rangle$ at each time, $t$, and thus realize an LOP, we must choose $X^u$ at each time to concentrate the distinguishing power of the measurement entirely on the largest two eigenvalues of $\rho$ at that time, and measure in a basis that is unbiased with respect to the corresponding eigenvectors. It also tells us that the maximum achievable rate of reduction is

$$\langle\dot\Delta\rangle = -8k\langle\lambda_1\rangle|X^u_{10}|^2 = -2k\langle\lambda_1\rangle(x_{\max}-x_{\min})^2, \qquad (6)$$

where $\lambda_1$ is the second largest eigenvalue of $\rho(t)$, and $x_{\max}$ and $x_{\min}$ are the maximum and minimum eigenvalues of $X$. Note that $\langle\lambda_1\rangle$ also decreases at the rate $\langle\dot\Delta\rangle$. We have complete freedom in choosing the unitary submatrix $V$, as it has no effect on $\langle\dot\Delta\rangle$. Moreover, $V$ has no effect on $\lambda_1$; $V$ induces transition rates only between the $(N-2)$ smallest eigenvalues.

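We can now obtain a lower bound on the performance of LOPs for any system. Whatever the choice of $V$, $\langle\lambda_1\rangle$ will always be greater than or equal to $\langle\Delta\rangle/(N-1)$, since $\lambda_1$ is the largest of the $N-1$ small eigenvalues that sum to $\Delta$. We therefore have $\langle\dot\Delta\rangle \le -[8/(N-1)]\,k\langle\Delta\rangle|X^u_{10}|^2$, and thus in the absence of noise, throughout the evolution the error probability will satisfy

$$\langle\Delta(t)\rangle \le \Delta_0\,e^{-2k(\delta x)^2 t/(N-1)}, \qquad (7)$$

where $\delta x = x_{\max} - x_{\min}$ and $\Delta_0$ is the initial value of $\Delta$.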

In the presence of noise, the important quantity is the steady-state error probability, $\langle\Delta\rangle_{\rm ss}$, and we can re-derive the lower bound on this given in [15]. In the worst case, $V$ leaves the $N-2$ smallest eigenvalues unchanged, so that under isotropic noise all the small eigenvalues remain identical once homogenized by the action of the LOP. The equation of motion for $\langle\Delta\rangle$ is then $\langle\dot\Delta\rangle = -8k\langle\Delta\rangle|X^u_{10}|^2/(N-1) + \beta/2$ (recall that $\beta$ is the noise strength). This gives $\langle\Delta\rangle_{\rm ss} = [\beta(N-1)]/[4k(\delta x)^2]$, a lower bound on the performance of all LOPs with isotropic noise.
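To give a feel for the size of these bounds, the sketch below evaluates Eq. (7) and the steady-state expression for the rates quoted earlier in the text ($10^6\,\mathrm{s}^{-1}$ for decoherence and $k = 5\times 10^7\,\mathrm{s}^{-1}$). Two assumptions are made purely for illustration: the decoherence rate is identified with the noise strength $\beta$, and the eigenvalue spread $\delta x$ is set to 1, since the source gives no value for it. The function names are hypothetical.

```python
import math

def error_bound(t, delta0, k, dx, N):
    """Right-hand side of Eq. (7): Delta_0 * exp(-2 k dx^2 t / (N - 1))."""
    return delta0 * math.exp(-2.0 * k * dx ** 2 * t / (N - 1))

def steady_state_error(beta, k, dx, N):
    """Steady-state bound for LOPs with isotropic noise: beta (N - 1) / (4 k dx^2)."""
    return beta * (N - 1) / (4.0 * k * dx ** 2)

beta = 1e6   # s^-1; assumes the quoted decoherence rate plays the role of beta
k = 5e7      # s^-1; measurement rate quoted in the text
dx = 1.0     # assumption: eigenvalue spread x_max - x_min of the measured observable

ss_qubit = steady_state_error(beta, k, dx, N=2)
ss_qutrit = steady_state_error(beta, k, dx, N=3)
print(ss_qubit, ss_qutrit)
print(error_bound(t=1e-8, delta0=0.01, k=k, dx=dx, N=3))
```

Under these assumptions the steady-state error is of order $10^{-2}$ or smaller, consistent with the near-pure regime in which the expansion is made.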

We further have the nice result that, for qubits and qutrits, the two lower bounds just derived are tight: here they give the performance of the best LOPs because the action of $V$ is trivial for $N < 4$.

For $N \ge 4$, to obtain the best LOP, one would need to choose $V$ to continually minimize the entropy of the smallest $N-2$ eigenvalues in such a way as to allow one to generate the largest possible value of $|\langle\dot\Delta\rangle|$ at all future times. Using time-independent perturbation theory [20], we can derive from Eq. (1) the equations of motion for all the small eigenvalues. These are

$$d\lambda_i = F_i(\boldsymbol{\lambda}, X^u)\,dt + \sigma_i(\boldsymbol{\lambda}, X^u)\,dW, \qquad (8)$$

where $\boldsymbol{\lambda} = (\lambda_1,\ldots,\lambda_{N-1})$, and

$$F_i = -8k\lambda_i|X^u_{i0}|^2 + 8k\sum_{j\neq 0,i}\frac{\lambda_i\lambda_j}{\lambda_i-\lambda_j}|X^u_{ij}|^2, \qquad \sigma_i = \sqrt{8k}\,\lambda_i\,(X^u_{ii}-X^u_{00}), \qquad (9)$$

for $i = 1,\ldots,N-1$. Note that under locally optimal control, the equation for $\langle\lambda_1\rangle$ reduces to $d\langle\lambda_1\rangle = -8k\langle\lambda_1\rangle|X^u_{10}|^2\,dt$.
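The drift and noise coefficients of Eq. (9) are straightforward to evaluate. The sketch below is a minimal illustration, assuming non-degenerate small eigenvalues; the function name, the numerical values, and the two test matrices are hypothetical. It checks two properties implied by the text: summing the $F_i$ recovers the drift of Eq. (2), and for a block-structured $X^u$ of the locally optimal form with equal diagonal entries, the noise terms vanish and $F_1 = -8k\lambda_1|X^u_{10}|^2$.

```python
import numpy as np

def drift_and_noise(small_lams, Xu, k):
    """F_i and sigma_i of Eq. (9) for i = 1..N-1.

    small_lams[a] holds lambda_{a+1}; Xu is the N x N Hermitian observable
    written in the eigenbasis of rho. Assumes the small eigenvalues are distinct.
    """
    n = len(small_lams)
    F = np.zeros(n)
    sigma = np.zeros(n)
    for a in range(n):
        i = a + 1
        F[a] = -8 * k * small_lams[a] * abs(Xu[i, 0]) ** 2
        for b in range(n):
            if b != a:
                j = b + 1
                F[a] += (8 * k * small_lams[a] * small_lams[b] * abs(Xu[i, j]) ** 2
                         / (small_lams[a] - small_lams[b]))
        sigma[a] = np.sqrt(8 * k) * small_lams[a] * (Xu[i, i] - Xu[0, 0]).real
    return F, sigma

k = 1.0
lams = np.array([3e-3, 2e-3, 1e-3])        # hypothetical lambda_1 > lambda_2 > lambda_3

# Block-structured observable of the locally optimal form (hypothetical entries)
Xu_lop = np.array([[0.0, 1.5, 0.0, 0.0],
                   [1.5, 0.0, 0.0, 0.0],
                   [0.0, 0.0, 0.0, 0.5],
                   [0.0, 0.0, 0.5, 0.0]])
F_lop, sigma_lop = drift_and_noise(lams, Xu_lop, k)

# Generic real symmetric observable: the pairwise terms cancel in the sum over i
Xu_gen = np.array([[0.2, 1.0, 0.5, 0.3],
                   [1.0, -0.4, 0.7, 0.1],
                   [0.5, 0.7, 0.9, 0.6],
                   [0.3, 0.1, 0.6, -1.1]])
F_gen, _ = drift_and_noise(lams, Xu_gen, k)
total_drift = -8 * k * sum(lams[a] * Xu_gen[a + 1, 0] ** 2 for a in range(3))
print(F_lop, sigma_lop, F_gen.sum(), total_drift)
```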

The equations for the small eigenvalues are nonlinear. As a result, in finding the optimal $V$ one cannot in general easily eliminate the stochastic terms as we have been able to in the analysis so far, even though we are interested purely in the average value of $\Delta$. Nevertheless, for $N = 4$ we can obtain the optimal $V$ by combining the above results with those of [8], which shows that the maximal increase in the largest eigenvalue of a qubit is obtained when the observable is unbiased with respect to the eigenvectors. Since the best thing we can do, given that we continually maximize the rate of reduction of $\langle\Delta\rangle$, is to separate the two smallest eigenvalues as rapidly as possible, the result in [8] tells us that $V$ must be unbiased with respect to the eigenvectors of the two smallest eigenvalues. We now label the eigenvalues of $X$ in decreasing order as $x_i$. Because the SME is invariant under the transformation $X \to X + \alpha I$ ($\alpha$ real), we add a constant to $X$ so that $x_4 = -x_1$, without loss of generality. The best locally optimal control is then achieved by

$$X^u = \begin{pmatrix} 0 & x_1 & 0 & 0 \\ x_1 & 0 & 0 & 0 \\ 0 & 0 & d & c \\ 0 & 0 & c & d \end{pmatrix}, \qquad (10)$$

where $2d = x_2 + x_3$, $2c = x_2 - x_3$, and $x_1 > c$. If the eigenvalues of $X$ are equally spaced, then $d = 0$ and all stochastic terms vanish. In this case $\langle\Delta\rangle = \Delta$, and the equations for the system, excluding noise, are



$$\dot\lambda_1 = \dot\Delta = -8k x_1^2\lambda_1, \qquad (11)$$
$$\dot\lambda_2 = -\dot\lambda_3 = 8kc^2\lambda_2\lambda_3/(\lambda_2-\lambda_3), \qquad (12)$$

where $\Delta = \sum_{i=1}^{3}\lambda_i$. Even though these equations are nonlinear, it is possible to obtain an analytic expression for the behavior of $\Delta$ once certain transients have died away. To do this note that the LOP first equalizes $\lambda_1$ and $\lambda_2$, and then must rapidly and repeatedly swap them. As a result they remain equal, and their derivatives become the average of $\dot\lambda_1$ and $\dot\lambda_2$ above. Next, calculating the derivative of the ratio $R = \lambda_1/\lambda_3$, we find that for $x_1 > \sqrt{2}c$, $R$ stabilizes at the value $R_{\rm ss} = (x_1^2+c^2)/(x_1^2-c^2)$. Once this has happened, the equation for $\Delta$ reduces to the simple exponential decay $\dot\Delta = -\gamma\Delta$, with the rate

$$\gamma = [4k/(3c^2)](x_1^2-c^2)(x_1^2-2c^2), \qquad x_1 > \sqrt{2}c. \qquad (13)$$

For $c < x_1 < \sqrt{2}c$, after a time such that $R \gg 1$, the result is also exponential decay, but with $\gamma = 4kx_1^2$.

We have now found the best locally optimal protocols for $N = 3$ and 4, but in each case the LOP is not necessarily the optimal protocol. We will now examine the LOP for a qutrit and show that under certain conditions it is the optimal protocol. Before we do this, we note that we can use the theorem above to place an upper bound on the performance of any protocol for all systems in the regime of good control. Since the fastest achievable rate of reduction is $\langle\dot\Delta\rangle = -2k\langle\lambda_1\rangle(x_{\max}-x_{\min})^2$, and $\langle\lambda_1\rangle \le \langle\Delta\rangle$, the steady-state error probability for any protocol satisfies

$$\langle\Delta\rangle_{\rm ss} \ge \frac{\beta}{4k(x_{\max}-x_{\min})^2}, \qquad (14)$$

where $\beta$ is once again the noise strength. This is true for any noise process, isotropic or otherwise.

We now analyze the case of a qutrit when $X$ has equally spaced eigenvalues. As usual we denote these as $x_1 > x_2 > x_3$. We also add a constant to $X$ so that $x_3 = -x_1$ and $x_2 = 0$. With these definitions, the LOP for a single qutrit involves choosing $U$ so that

$$X^u = \begin{pmatrix} 0 & q & 0 \\ q & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \qquad (15)$$

where $q = (x_1 - x_3)/2 = x_1$. This generates the evolution $(\lambda_1(t), \lambda_2(t)) = (\lambda_1^0 e^{-\gamma t}, \lambda_2^0)$, where $\lambda_1^0$ and $\lambda_2^0$ are the initial values of $\lambda_1$ and $\lambda_2$, and we have defined $\gamma = 8kq^2$. This measurement is applied until $\lambda_1(t) = \lambda_2$, which occurs after a time $\tau = \ln(\lambda_1^0/\lambda_2^0)/\gamma$. At this point the LOP changes abruptly, and involves rapidly switching the measurement between $X^u$ and $X^{u'} = O_{\rm flip} X^u O_{\rm flip}^\dagger$, where $O_{\rm flip}$ swaps the eigenstates of $\lambda_1$ and $\lambda_2$. In the limit of fast switching, this generates the evolution $(\lambda_1(t), \lambda_2(t)) = (\lambda_1(\tau), \lambda_2(\tau))\,e^{-\gamma(t-\tau)/2}$ for $t > \tau$. Denoting now the initial time by $t$, the error probability under the LOP at the final time $T$ (the horizon time) is thus

$$\Delta_{\rm LOP}(\boldsymbol{\lambda}, t, T) = \lambda_1 e^{-\gamma(T-t)} + \lambda_2, \qquad T < t + \tau,$$
$$\Delta_{\rm LOP}(\boldsymbol{\lambda}, t, T) = 2\sqrt{\lambda_1\lambda_2}\,e^{-\gamma(T-t)/2}, \qquad T > t + \tau,$$

where we have defined $\lambda_1 = \lambda_1(t)$, $\lambda_2 = \lambda_2(t)$ and $\boldsymbol{\lambda} = (\lambda_1,\lambda_2)$. In optimal control theory, the quantity we wish to minimize, as a function of the initial and final times, is called the cost function. Having an explicit expression for the cost function generated by the LOP, $\Delta_{\rm LOP}(\boldsymbol{\lambda},t,T)$, we can now use the verification theorems of optimal control theory to determine whether the LOP is the optimal protocol [21, 22]. We first consider the case when $T < t+\tau$. For the LOP to be optimal, the cost function must satisfy the Hamilton-Jacobi-Bellman (HJB) equation corresponding to the dynamical equations for the system (Eq. (8)) [22]:

$$\frac{\partial\Delta}{\partial t} = \max_{X^u(t)}\left[-\sum_i F_i\frac{\partial\Delta}{\partial\lambda_i} - \frac{1}{2}\sum_{i,j}\sigma_i\sigma_j\frac{\partial^2\Delta}{\partial\lambda_i\partial\lambda_j}\right]. \qquad (16)$$



To check that the cost function is a solution to this equation, one substitutes in $\Delta_{\rm LOP}$ for $T < t+\tau$ on the RHS, and then optimizes this at each time $s$ with respect to $X^u$ (being the set of control parameters). We must check that $\Delta_{\rm LOP}$ is a solution to the HJB equation, and that the maximum on the RHS is realized when $X^u(t)$ is precisely that prescribed by the LOP. Performing the substitution, we find that the RHS is

$$\max_{X^u}\left\{\xi(t)|X^u_{10}|^2 + m^2 + \eta(t)|X^u_{12}|^2\right\}(8k\lambda_2), \qquad (17)$$

where $m = |X^u_{20}|$, $\xi = (\lambda_1/\lambda_2)e^{-\gamma(T-t)} > 1$ and $\eta(t) < 1$. Note that $\gamma$ is already fixed by the LOP, and thus does not take part in the optimization. We performed this maximization over $X^u$ numerically, and verified that whenever $\xi > 1 > \eta$, the maximum is obtained by the locally optimal $U$. Thus $|X^u_{10}| = q$ and $|X^u_{20}| = m = 0$. The RHS of the HJB equation is therefore $\gamma\lambda_1 e^{-\gamma(T-t)}$, and this is indeed equal to $\partial\Delta_{\rm LOP}/\partial t$, for $T < t+\tau$, being the LHS. Since the derivatives of $\Delta_{\rm LOP}$ that appear in the HJB equation are all continuous for $T < t+\tau$ (the final requirement of the verification theorem), the LOP is the optimal protocol for $T < t+\tau$.
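The verification step for $T < t+\tau$ can be illustrated numerically. The sketch below is not the authors' numerical procedure; it is a simple random search under stated assumptions. The parameter values and function names are hypothetical, and $\eta$ is taken to be $\lambda_1(1-e^{-\gamma(T-t)})/(\lambda_1-\lambda_2)$, which is what substituting the first branch of $\Delta_{\rm LOP}$ into the drift terms of Eq. (9) gives (the source only states that $\eta < 1$). The code compares the objective of Eq. (17) at the locally optimal $U$ with sampled unitaries, and compares the resulting RHS with a finite-difference estimate of $\partial\Delta_{\rm LOP}/\partial t$.

```python
import numpy as np

def delta_lop(l1, l2, t, T, gamma):
    """Cost function under the LOP for a qutrit (both branches)."""
    tau = np.log(l1 / l2) / gamma
    if T < t + tau:
        return l1 * np.exp(-gamma * (T - t)) + l2
    return 2.0 * np.sqrt(l1 * l2) * np.exp(-gamma * (T - t) / 2.0)

def objective(U, X, xi, eta):
    """Bracket of Eq. (17): xi |X_10|^2 + |X_20|^2 + eta |X_12|^2 for X^u = U X U^dagger."""
    Xu = U @ X @ U.conj().T
    return xi * abs(Xu[1, 0]) ** 2 + abs(Xu[2, 0]) ** 2 + eta * abs(Xu[1, 2]) ** 2

def random_unitary(n, rng):
    """Random unitary from the QR decomposition of a complex Gaussian matrix."""
    Z = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    Q, R = np.linalg.qr(Z)
    d = np.diag(R)
    return Q * (d / np.abs(d))

# Hypothetical parameters with T < t + tau (tau = ln(4)/8 here)
k, q = 1.0, 1.0
gamma = 8 * k * q ** 2
l1, l2, t, T = 4e-3, 1e-3, 0.0, 0.1
decay = np.exp(-gamma * (T - t))
xi = (l1 / l2) * decay
eta = l1 * (1 - decay) / (l1 - l2)         # assumption, see lead-in

X = np.diag([q, -q, 0.0])                  # extreme eigenvalues in the first 2x2 block
U_lop = np.eye(3, dtype=complex)
U_lop[:2, :2] = np.array([[1, 1], [1, -1]]) / np.sqrt(2)

rng = np.random.default_rng(1)
obj_lop = objective(U_lop, X, xi, eta)
obj_best_random = max(objective(random_unitary(3, rng), X, xi, eta) for _ in range(5000))

rhs = 8 * k * l2 * obj_lop                 # RHS of the HJB equation at the LOP
h = 1e-6                                   # finite-difference step in the initial time t
lhs = (delta_lop(l1, l2, t + h, T, gamma) - delta_lop(l1, l2, t - h, T, gamma)) / (2 * h)
print(xi, eta, obj_lop, obj_best_random, rhs, lhs)
```

With these values $\xi > 1 > \eta$, the sampled objective should stay below its value at the locally optimal $U$, and `rhs` and `lhs` should agree to within the finite-difference error.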

To determine whether the LOP is optimal for $T > t+\tau$, we note that the derivatives of $\Delta_{\rm LOP}$ are not continuous at $t = T - \tau$. As a result the classic verification theorem employed above no longer applies; we need a new verification theorem, developed in the last decade [23], that uses generalized solutions of second-order partial differential equations, referred to as viscosity solutions [24]. Applying this "viscosity" verification theorem to the LOP for a qutrit shows that it remains optimal for $T > t+\tau$. Since viscosity solutions will be unfamiliar to most readers, the details of this analysis will be presented elsewhere. We have also performed the analysis for the case when the eigenvalues of $X$ are not equally spaced, and in this case we find that the locally optimal protocol is not globally optimal.

In summary, we have found the class of all locally optimal feedback protocols in the regimes of good control and strong feedback, and obtained explicit expressions for the best of these for $N = 3$ and $N = 4$. We have also shown that the former is globally optimal for some, but not for all, observables. The question of how to beat the LOP for a single qutrit when it is not optimal is an interesting one, and will be the subject of future work.